Goto

Collaborating Authors

 Bình Định Province


A Clinician-Friendly Platform for Ophthalmic Image Analysis Without Technical Barriers

Wang, Meng, Lin, Tian, Hou, Qingshan, Lin, Aidi, Wang, Jingcheng, Peng, Qingsheng, Nguyen, Truong X., Fang, Danqi, Zou, Ke, Xu, Ting, Xue, Cancan, Quek, Ten Cheer, Yu, Qinkai, Liu, Minxin, Zhou, Hui, Xiao, Zixuan, He, Guiqin, Liang, Huiyu, Shi, Tingkun, Chen, Man, Liu, Linna, Peng, Yuanyuan, Wang, Lianyu, Hu, Qiuming, Chen, Junhong, Zhang, Zhenhua, Chen, Cheng, Zhao, Yitian, Liu, Dianbo, Wu, Jianhua, Chen, Xinjian, Zhang, Changqing, Nguyen, Triet Thanh, Meng, Yanda, Zheng, Yalin, Tham, Yih Chung, Cheung, Carol Y., Fu, Huazhu, Chen, Haoyu, Cheng, Ching-Yu

arXiv.org Artificial Intelligence

Artificial intelligence (AI) shows remarkable potential in medical imaging diagnostics, yet most current models require retraining when applied across different clinical settings, limiting their scalability. We introduce GlobeReady, a clinician-friendly AI platform that enables fundus disease diagnosis that operates without retraining, fine-tuning, or the needs for technical expertise. GlobeReady demonstrates high accuracy across imaging modalities: 93.9-98.5% for 11 fundus diseases using color fundus photographs (CPFs) and 87.2-92.7% for 15 fundus diseases using optic coherence tomography (OCT) scans. By leveraging training-free local feature augmentation, GlobeReady platform effectively mitigates domain shifts across centers and populations, achieving accuracies of 88.9-97.4% across five centers on average in China, 86.3-96.9% in Vietnam, and 73.4-91.0% in Singapore, and 90.2-98.9% in the UK. Incorporating a bulit-in confidence-quantifiable diagnostic mechanism further enhances the platform's accuracy to 94.9-99.4% with CFPs and 88.2-96.2% with OCT, while enabling identification of out-of-distribution cases with 86.3% accuracy across 49 common and rare fundus diseases using CFPs, and 90.6% accuracy across 13 diseases using OCT. Clinicians from countries rated GlobeReady highly for usability and clinical relevance (average score 4.6/5). These findings demonstrate GlobeReady's robustness, generalizability and potential to support global ophthalmic care without technical barriers.


Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model

Dat, Phan Tran Minh, Khang, Vo Hoang Nhat, Tho, Quan Thanh

arXiv.org Artificial Intelligence

This work explores the journey towards achieving Bahnaric-Vietnamese translation for the sake of culturally bridging the two ethnic groups in Vietnam. However, translating from Bahnaric to Vietnamese also encounters some difficulties. The most prominent challenge is the lack of available original Bahnaric resources source language, including vocabulary, grammar, dialogue patterns and bilingual corpus, which hinders the data collection process for training. To address this, we leverage a transfer learning approach using sequence-to-sequence pre-training language model. First of all, we leverage a pre-trained Vietnamese language model to capture the characteristics of this language. Especially, to further serve the purpose of machine translation, we aim for a sequence-to-sequence model, not encoder-only like BERT or decoder-only like GPT. Taking advantage of significant similarity between the two languages, we continue training the model with the currently limited bilingual resources of Vietnamese-Bahnaric text to perform the transfer learning from language model to machine translation. Thus, this approach can help to handle the problem of imbalanced resources between two languages, while also optimizing the training and computational processes. Additionally, we also enhanced the datasets using data augmentation to generate additional resources and defined some heuristic methods to help the translation more precise. Our approach has been validated to be highly effective for the Bahnaric-Vietnamese translation model, contributing to the expansion and preservation of languages, and facilitating better mutual understanding between the two ethnic people.


Innovative Silicosis and Pneumonia Classification: Leveraging Graph Transformer Post-hoc Modeling and Ensemble Techniques

Bui, Bao Q., Nguyen, Tien T. T., Le, Duy M., Tran, Cong, Pham, Cuong

arXiv.org Artificial Intelligence

This paper presents a comprehensive study on the classification and detection of Silicosis-related lung inflammation. Our main contributions include 1) the creation of a newly curated chest X-ray (CXR) image dataset named SVBCX that is tailored to the nuances of lung inflammation caused by distinct agents, providing a valuable resource for silicosis and pneumonia research community; and 2) we propose a novel deep-learning architecture that integrates graph transformer networks alongside a traditional deep neural network module for the effective classification of silicosis and pneumonia. Additionally, we employ the Balanced Cross-Entropy (BalCE) as a loss function to ensure more uniform learning across different classes, enhancing the model's ability to discern subtle differences in lung conditions. The proposed model architecture and loss function selection aim to improve the accuracy and reliability of inflammation detection, particularly in the context of Silicosis. Furthermore, our research explores the efficacy of an ensemble approach that combines the strengths of diverse model architectures. Experimental results on the constructed dataset demonstrate promising outcomes, showcasing substantial enhancements compared to baseline models. The ensemble of models achieves a macro-F1 score of 0.9749 and AUC ROC scores exceeding 0.99 for each class, underscoring the effectiveness of our approach in accurate and robust lung inflammation classification.


Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges

Van Dinh, Nguyen, Dang, Thanh Chi, Nguyen, Luan Thanh, Van Nguyen, Kiet

arXiv.org Artificial Intelligence

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) Dialect identification and (2) Speech recognition. The empirical results suggest two implications including the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.


VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension

Ngo, Thinh Phuoc, Dang, Khoa Tran Anh, Luu, Son T., Van Nguyen, Kiet, Nguyen, Ngan Luu-Thuy

arXiv.org Artificial Intelligence

This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks. The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or textbooks. In contrast, the VlogQA consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube -- an extensive source of user-uploaded content, covering the topics of food and travel. By capturing the spoken language of native Vietnamese speakers in natural settings, an obscure corner overlooked in Vietnamese research, the corpus provides a valuable resource for future research in reading comprehension tasks for the Vietnamese language. Regarding performance evaluation, our deep-learning models achieved the highest F1 score of 75.34% on the test set, indicating significant progress in machine reading comprehension for Vietnamese spoken language data. In terms of EM, the highest score we accomplished is 53.97%, which reflects the challenge in processing spoken-based content and highlights the need for further improvement.


ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text

Nguyen, Thanh-Nhi, Le, Thanh-Phong, Van Nguyen, Kiet

arXiv.org Artificial Intelligence

Lexical normalization, a fundamental task in Natural Language Processing (NLP), involves the transformation of words into their canonical forms. This process has been proven to benefit various downstream NLP tasks greatly. In this work, we introduce Vietnamese Lexical Normalization (ViLexNorm), the first-ever corpus developed for the Vietnamese lexical normalization task. The corpus comprises over 10,000 pairs of sentences meticulously annotated by human annotators, sourced from public comments on Vietnam's most popular social media platforms. Various methods were used to evaluate our corpus, and the best-performing system achieved a result of 57.74% using the Error Reduction Rate (ERR) metric (van der Goot, 2019a) with the Leave-As-Is (LAI) baseline. For extrinsic evaluation, employing the model trained on ViLexNorm demonstrates the positive impact of the Vietnamese lexical normalization task on other NLP tasks. Our corpus is publicly available exclusively for research purposes.


A Novel Explainable Artificial Intelligence Model in Image Classification problem

Cao, Quoc Hung, Nguyen, Truong Thanh Hung, Nguyen, Vo Thanh Khang, Nguyen, Xuan Phong

arXiv.org Artificial Intelligence

In recent years, artificial intelligence is increasingly being applied widely in many different fields and has a profound and direct impact on human life. Following this is the need to understand the principles of the model making predictions. Since most of the current high-precision models are black boxes, neither the AI scientist nor the end-user deeply understands what's going on inside these models. Therefore, many algorithms are studied for the purpose of explaining AI models, especially those in the problem of image classification in the field of computer vision such as LIME, CAM, GradCAM. However, these algorithms still have limitations such as LIME's long execution time and CAM's confusing interpretation of concreteness and clarity. Therefore, in this paper, we propose a new method called Segmentation - Class Activation Mapping (SeCAM) that combines the advantages of these algorithms above, while at the same time overcoming their disadvantages. We tested this algorithm with various models, including ResNet50, Inception-v3, VGG16 from ImageNet Large Scale Visual Recognition Challenge (ILSVRC) data set. Outstanding results when the algorithm has met all the requirements for a specific explanation in a remarkably concise time.


Eyenuk AI Techlology Selected for Diabetic Eye Testing in Vietnam Supported by The Fred Hollows Foundation

#artificialintelligence

Eyenuk, Inc., a global artificial intelligence (AI) medical technology and services company and the leader in real-world applications for AI Eye Screening, announced that its EyeArt AI system for diabetic eye testing has been chosen for deployment in 4 hospitals in Binh Dinh Province, Vietnam. The project is funded by The Fred Hollows Foundation. "We are excited to implement the EyeArt AI system to expand our capabilities in detection and treatment of diabetic retinopathy. It will help us reach our goal to protect the vision of approximately six million people with diabetes living in Vietnam," said Pham Quoc Anh, Vietnam Country Manager for The Fred Hollows Foundation. "This important project will help us continue the work first started by Professor Fred Hollows 29 years ago, fulfilling his vision to bring equitable eye health for all."